mem: Add faster tiered buffer pool #8775
Conversation
Codecov Report ❌ Patch coverage is
Additional details and impacted files
```
@@            Coverage Diff             @@
##           master    #8775      +/-   ##
==========================================
- Coverage   83.22%   80.66%   -2.56%
==========================================
  Files         418      416       -2
  Lines       32385    33495    +1110
==========================================
+ Hits        26952    27019      +67
- Misses       4050     4663     +613
- Partials     1383     1813     +430
```
easwars left a comment
I thought we decided to make it possible for the user to set the default buffer pool for the whole process through an experimental API, and get rid of the existing dial option and the server option. Was your plan to do that in a follow-up?
I was waiting for the author of #8770 to raise a PR for exposing a function to set the default buffer pool. See the second part of #8770 (comment).
I'm not sure if we want to do this. Maybe people want to use different buffer pools for each channel. In this PR, I'm just improving the existing buffer pool implementation.
Also, would it make sense to add some of the micro benchmarks that you used as part of this PR? Thanks.
3f63cea to
e467b97
Compare
/gemini review
Code Review
This pull request introduces a new binaryTieredBufferPool which provides a significant performance improvement for buffer pool lookups by using power-of-2 tier sizes for O(1) lookups. The implementation is clever, leveraging math/bits for efficient calculations. The accompanying tests are thorough, including architecture-specific checks and benchmarks that demonstrate the performance gains. I've found a minor issue in one of the new benchmark tests that could lead to a panic. Overall, this is an excellent contribution that improves performance and is well-implemented.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
mem/buffer_pool.go
Outdated
```go
var defaultBufferPoolSizeExponents = []uint8{
	8,
	goPageSizeExponent,
	14, // 16KB (max HTTP/2 frame size used by gRPC)
	15, // 32KB (default buffer size for io.Copy)
	20, // 1MB
}
```

```diff
-var defaultBufferPool BufferPool
+var (
+	defaultBufferPool BufferPool
+	uintSize = bits.UintSize // use a variable for mocking during tests.
+)
```
Nit: Group all these vars in one block, or don't have a block at all (like you currently do for consts).
Grouped all the vars and consts in a single block.
mem/buffer_pool.go
Outdated
```go
// Allocating slices of size > 2^maxExponent isn't possible on
// maxExponent-bit machines.
if int(exp) > maxExponent {
	return nil, fmt.Errorf("allocating slice of size 2^%d is not possible", exp)
}
```
Nit: prefix the error string with package name?
Added the prefix.
```go
func (b *binaryTieredBufferPool) Put(buf *[]byte) {
	b.poolForPut(cap(*buf)).Put(buf)
}
```
Can we add a comment here capturing the subtle but important fact that we are passing the capacity of the buffer, and not the size of the buffer to poolForPut. If we did the latter, all buffers would eventually move to the smallest pool.
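The subtlety can be illustrated with a small sketch (`tierForPut` and the tier sizes below are hypothetical stand-ins, not the PR's actual code): after a caller re-slices a buffer, only its capacity still identifies the tier it was allocated from.

```go
package main

import "fmt"

// tierForPut is a hypothetical stand-in for poolForPut's tier
// selection: it returns the index of the largest tier whose size
// still fits within the given capacity c.
func tierForPut(c int) int {
	tiers := []int{64, 4096, 32768}
	idx := 0
	for i, size := range tiers {
		if c >= size {
			idx = i
		}
	}
	return idx
}

func main() {
	buf := make([]byte, 32768) // buffer taken from the 32KB tier
	buf = buf[:100]            // caller re-sliced it to the bytes it used

	// The length no longer reflects the tier the buffer came from;
	// only the capacity does. Using len here would return every
	// recycled buffer to ever-smaller pools over time.
	fmt.Println(tierForPut(len(buf))) // 0: the 64B tier, wrong
	fmt.Println(tierForPut(cap(buf))) // 2: the 32KB tier, correct
}
```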
mem/buffer_pool_internal_test.go
Outdated
```go
		wantErr bool
	}{
		{
			name: "32-bit valid exponent",
```
Nit: Make subtest names more like identifiers. go/go-style/decisions#subtest-names
mem/buffer_pool_internal_test.go
Outdated
```go
	t.Errorf("NewBinaryTieredBufferPool() error = %v, wantErr %t", err, tt.wantErr)
	return
}
if err == nil {
```
Nit: invert the conditional and return early for tests where an error was expected and was seen.
```go
for b.Loop() {
	for size := range 1 << 19 {
		_ = p.poolForGet(size)
		_ = p.poolForPut(size)
```
Does it matter that we are not passing capacity here to poolForPut?
It is true that this benchmark doesn't use the buffer's expected capacity, but the results should still be similar. The benchmark intentionally avoids fetching a buffer so that we measure only the pool lookup overhead, excluding allocation time. While we could determine the expected capacity by type-asserting the pool to sizedBufferPool and reading its defaultSize field, that would make the benchmark brittle by relying on a private implementation detail. Therefore, I am not making the change at this time.
/gemini review
Code Review
This pull request introduces a new binaryTieredBufferPool which provides O(1) lookup time for buffer pools by using power-of-2 tier sizes. The implementation is well-structured, includes comprehensive tests, and the benchmarks demonstrate a significant performance improvement over the existing tieredBufferPool. I have one suggestion to handle duplicate exponents in the pool configuration to prevent potential resource leaks.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
This change adds a new tiered buffer pool that uses power-of-2 tier
sizes. It reduces the lookup time for the relevant sizedBufferPool from
$O(\log n)$ to $O(1)$, where n is the number of tiers. This creates
constant-time lookups independent of the tier count, allowing users to
add more tiers without performance overhead.
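A minimal sketch of the lookup idea (not the PR's exact implementation): with power-of-2 tier sizes, the tier for a requested size can be derived in constant time from the bit length of the size via math/bits, instead of scanning or binary-searching the tier list.

```go
package main

import (
	"fmt"
	"math/bits"
)

// exponentFor returns the exponent e of the smallest power of 2 with
// 2^e >= size. bits.Len makes this O(1), independent of how many
// tiers the pool has.
func exponentFor(size int) int {
	if size <= 1 {
		return 0
	}
	return bits.Len(uint(size - 1))
}

func main() {
	for _, size := range []int{1, 100, 4096, 4097, 1 << 20} {
		fmt.Printf("size %7d -> tier 2^%d\n", size, exponentFor(size))
	}
}
```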
## Benchmarks
Micro-benchmark that measures only the pool query performance, ignoring
the allocation time:
```go
func BenchmarkSearch(b *testing.B) {
defaultBufferPoolSizes := make([]int, len(defaultBufferPoolSizeExponents))
for i, exp := range defaultBufferPoolSizeExponents {
defaultBufferPoolSizes[i] = 1 << exp
}
b.Run("pool=Tiered", func(b *testing.B) {
p := NewTieredBufferPool(defaultBufferPoolSizes...).(*tieredBufferPool)
for b.Loop() {
for size := range 1 << 19 {
// One for get, one for put.
_ = p.getPool(size)
_ = p.getPool(size)
}
}
})
b.Run("pool=BinaryTiered", func(b *testing.B) {
p := NewBinaryTieredBufferPool(defaultBufferPoolSizeExponents...).(*binaryTieredBufferPool)
for b.Loop() {
for size := range 1 << 19 {
_ = p.poolForGet(size)
_ = p.poolForPut(size)
}
}
})
}
```
With 5 tiers:
```sh
go test -bench=BenchmarkSearch -count=10 -benchmem | benchstat -col '/pool' -
goos: linux
goarch: amd64
pkg: google.golang.org/grpc/mem
cpu: Intel(R) Xeon(R) CPU @ 2.60GHz
│ Tiered │ BinaryTiered │
│ sec/op │ sec/op vs base │
Search-48 5.353m ± 2% 2.036m ± 0% -61.97% (p=0.000 n=10)
│ Tiered │ BinaryTiered │
│ B/op │ B/op vs base │
Search-48 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
¹ all samples are equal
│ Tiered │ BinaryTiered │
│ allocs/op │ allocs/op vs base │
Search-48 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
¹ all samples are equal
```
With 9 tiers:
```sh
go test -bench=BenchmarkSearch -count=10 -benchmem | benchstat -col '/pool' -
goos: linux
goarch: amd64
pkg: google.golang.org/grpc/mem
cpu: Intel(R) Xeon(R) CPU @ 2.60GHz
│ Tiered │ BinaryTiered │
│ sec/op │ sec/op vs base │
Search-48 5.659m ± 0% 2.035m ± 0% -64.04% (p=0.000 n=10)
│ Tiered │ BinaryTiered │
│ B/op │ B/op vs base │
Search-48 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
¹ all samples are equal
│ Tiered │ BinaryTiered │
│ allocs/op │ allocs/op vs base │
Search-48 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
¹ all samples are equal
```
RELEASE NOTES:
* mem: Add faster tiered buffer pool. Use `NewBinaryTieredBufferPool` to
create such pools.
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Very much agree with this. As a user, I'd prefer to not have any global state at all. I'd prefer all things to be tied to a client or to a server. I can share what I want to share across instances of these by myself. This is a more flexible approach.